fix(app): make ctx.NewSession wait for agent idle (#63)#64
Conversation
- Add ErrAgentBusy sentinel (shared between internal/app and internal/extensions) so callers can detect the busy condition with errors.Is instead of substring-matching the error message. - Add App.WaitForIdle(timeout) backed by a per-busy-cycle idleCh closed by a new setBusyLocked chokepoint; all busy transitions now route through it to keep the channel in sync with the busy flag. - Have RequestNewSessionFromExtension wait for idle (up to DefaultNewSessionIdleWait = 10m) instead of failing fast on IsBusy. This fixes the v0.79.0 phase-handoff race where OnAgentEnd fires from inside the agent loop, before drainQueue clears busy, so ctx.NewSession reliably failed with 'agent is busy'. - Expose ext.ErrAgentBusy to Yaegi via symbols.go. - Update NewSession godoc and phase-handoff example to document the new wait-then-send behavior. - Add regression tests covering already-idle, blocks-until-drain, timeout, zero-timeout, app-close, headless guard, and idleCh transitions. Fixes #63
|
Connected to Huly®: KIT-65 |
📝 WalkthroughWalkthroughFixes a race in ChangesIdle/busy signaling and NewSession race fix
Sequence Diagram(s)sequenceDiagram
participant Extension as Extension (OnAgentEnd)
participant NewSession as ctx.NewSession
participant RequestNewSession as RequestNewSessionFromExtension
participant WaitForIdle
participant drainQueue
participant idleCh
Extension->>NewSession: go ctx.NewSession(prompt)
NewSession->>RequestNewSession: RPC call
RequestNewSession->>WaitForIdle: WaitForIdle(10 min)
WaitForIdle->>idleCh: select on idleCh / timer / rootCtx.Done
drainQueue->>idleCh: setBusyLocked(false) → close(idleCh)
idleCh-->>WaitForIdle: channel closed, idle
WaitForIdle-->>RequestNewSession: nil
RequestNewSession-->>NewSession: start new session
NewSession-->>Extension: nil (success)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/app/app.go (1)
219-224: ⚡ Quick winWrap shutdown error with call-site context in
WaitForIdle.Line 223 returns
a.rootCtx.Err()directly; wrapping preserveserrors.Iswhile making upstream failures actionable.Suggested patch
case <-a.rootCtx.Done(): if timer != nil { timer.Stop() } - return a.rootCtx.Err() + return fmt.Errorf("wait for idle interrupted by app shutdown: %w", a.rootCtx.Err()) }As per coding guidelines, "Always check errors and wrap them with
fmt.Errorf("context: %w", err)".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/app/app.go` around lines 219 - 224, The error returned from a.rootCtx.Err() in the context cancellation case of the WaitForIdle method is being returned directly without wrapping. Wrap this error using fmt.Errorf with the %w verb to add contextual information about the shutdown, which makes the error actionable for upstream callers while preserving errors.Is compatibility and following the coding guidelines that require all errors to be wrapped with context.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@internal/app/app.go`:
- Around line 219-224: The error returned from a.rootCtx.Err() in the context
cancellation case of the WaitForIdle method is being returned directly without
wrapping. Wrap this error using fmt.Errorf with the %w verb to add contextual
information about the shutdown, which makes the error actionable for upstream
callers while preserving errors.Is compatibility and following the coding
guidelines that require all errors to be wrapped with context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f9986381-5ed6-4d80-a55c-3b501840bc59
📒 Files selected for processing (5)
examples/extensions/phase-handoff.gointernal/app/app.gointernal/app/app_test.gointernal/extensions/api.gointernal/extensions/symbols.go
Description
Fix the v0.79.0 phase-handoff race:
ctx.NewSessioninvoked from anOnAgentEndhandler always failed with"cannot start new session while agent is busy"becauseAgentEndis emitted from inside the agent loop —before
drainQueueclearsapp.busy— and any post-turn tooling(formatters, on-save linters, hidden tool calls) keeps the busy window open
for seconds to minutes after that.
Instead of failing fast,
RequestNewSessionFromExtensionnow blocks untilthe agent is genuinely idle (up to a generous 10-minute internal timeout)
and then sends the
NewSessionRequestEvent. The wait is built on a newApp.WaitForIdle(timeout)primitive backed by a per-busy-cycleidleChchannel that is closed on every
busy → idletransition. All existing busymutations are funnelled through a single
setBusyLockedhelper so thechannel always tracks the flag.
A sentinel
ErrAgentBusyis shared betweeninternal/appandinternal/extensions(and exposed to Yaegi asext.ErrAgentBusy), soextension authors can detect timeouts with
errors.Is(err, ext.ErrAgentBusy)instead of substring-matching the error message.
Before:
After: the same code just works —
NewSessionwaits for the agent tosettle (including any post-turn hooks) and then switches sessions.
Fixes #63
Type of Change
Checklist
ctx.NewSession, sentinel error doc, example comment update)(
go test -race ./...clean,golangci-lint runclean)Additional Information
Files changed
internal/app/app.go—ErrAgentBusysentinel,DefaultNewSessionIdleWait,idleChfield,setBusyLockedchokepoint,WaitForIdle,idleSnapshot;RequestNewSessionFromExtensionnow waitsfor idle instead of failing fast.
internal/extensions/api.go— declare sharedErrAgentBusy; expand theNewSessiongodoc to describe the wait-then-send semantics and remindauthors to call it from a goroutine.
internal/extensions/symbols.go— exposeext.ErrAgentBusyto Yaegi.examples/extensions/phase-handoff.go— update the comment around thego func() { ctx.NewSession(...) }()call to reflect the new behaviour.internal/app/app_test.go— 7 new regression tests covering thealready-idle fast path, blocks-until-drain, timeout →
ErrAgentBusy,zero-timeout-waits-indefinitely, app-close release, headless guard, and
idleChchannel transitions.Backward compatibility
ctx.NewSessionnowsucceeds in cases that previously errored; it still returns the same
error shapes for non-busy failure modes (no TUI attached, before-hook
cancellation, session-file failure).
"agent is busy"will continueto match because the returned error still contains that phrase
(
fmt.Errorf("cannot start new session: %w", ErrAgentBusy)), but newcode should migrate to
errors.Is(err, ext.ErrAgentBusy).Summary by CodeRabbit
Bug Fixes
Documentation
Tests